CrossNER: Evaluating Cross-Domain Named Entity Recognition

Abstract

Cross-domain named entity recognition (NER) models are able to cope with the scarcity of NER samples in target domains. However, most existing NER benchmarks lack domain-specialized entity types or do not focus on a certain domain, leading to a less effective cross-domain evaluation. To address these obstacles, we introduce a cross-domain NER dataset (CrossNER), a fully-labeled collection of NER data spanning five diverse domains with specialized entity categories for the different domains. Additionally, we also provide a domain-related corpus, since using it to continue pre-training language models (domain-adaptive pre-training) is effective for domain adaptation. We then conduct comprehensive experiments to explore the effectiveness of leveraging different levels of the domain corpus and different pre-training strategies for domain-adaptive pre-training on the cross-domain task. Results show that focusing on the fractional corpus containing domain-specialized entities and utilizing a more challenging pre-training strategy in domain-adaptive pre-training are beneficial for NER domain adaptation, and our proposed method can consistently outperform existing cross-domain NER baselines. Nevertheless, experiments also illustrate the challenge of this cross-domain NER task. We hope that our dataset and baselines will catalyze research in the NER domain adaptation area. The code and data are available at https://github.com/zliucr/CrossNER.

Related resources

Cross-Domain Bootstrapping for Named Entity Recognition

We propose a general cross-domain bootstrapping algorithm for domain adaptation in the task of named entity recognition. We first generalize the lexical features of the source domain model with word clusters generated from a joint corpus. We then select target domain instances based on multiple criteria during the bootstrapping process. Without using annotated data from the target domain and wi...

Bootstrapping and Evaluating Named Entity Recognition in the Biomedical Domain

We demonstrate that bootstrapping a gene name recognizer for FlyBase curation from automatically annotated noisy text is more effective than fully supervised training of the recognizer on more general manually annotated biomedical text. We present a new test set for this task based on an annotation scheme which distinguishes gene names from gene mentions, enabling a more consistent annotation. ...

Named Entity Recognition for the Agricultural Domain

Agricultural data have a major role in the planning and success of rural development activities. Agriculturalists, planners, policy makers, government officials, farmers and researchers require relevant information to trigger decision making processes. This paper presents our approach towards extracting named entities from real-world agricultural data from different areas of agriculture using C...

Exploiting Domain Structure for Named Entity Recognition

Named Entity Recognition (NER) is a fundamental task in text mining and natural language understanding. Current approaches to NER (mostly based on supervised learning) perform well on domains similar to the training domain, but they tend to adapt poorly to slightly different domains. We present several strategies for exploiting the domain structure in the training data to learn a more robust na...

Domain adaptive bootstrapping for named entity recognition

Bootstrapping is the process of improving the performance of a trained classifier by iteratively adding data that is labeled by the classifier itself to the training set, and retraining the classifier. It is often used in situations where labeled training data is scarce but unlabeled data is abundant. In this paper, we consider the problem of domain adaptation: the situation where training data...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i15.17587